Trinets encode tree-child and level-2 phylogenetic networks
Phylogenetic networks generalize evolutionary trees, and are commonly used to
represent evolutionary histories of species that undergo reticulate
evolutionary processes such as hybridization, recombination and lateral gene
transfer. Recently, there has been great interest in trying to develop methods
to construct rooted phylogenetic networks from triplets, that is rooted trees
on three species. However, although triplets determine or encode rooted
phylogenetic trees, they do not in general encode rooted phylogenetic networks,
which is a potential issue for any such method. Motivated by this fact, Huber
and Moulton recently introduced trinets as a natural extension of rooted
triplets to networks. In particular, they showed that level-1 phylogenetic
networks are encoded by their trinets, and also conjectured that all
"recoverable" rooted phylogenetic networks are encoded by their trinets. Here
we prove that recoverable binary level-2 networks and binary tree-child
networks are also encoded by their trinets. To do this we prove two
decomposition theorems based on trinets which hold for all recoverable binary
rooted phylogenetic networks. Our results provide some additional evidence in
support of the conjecture that trinets encode all recoverable rooted
phylogenetic networks, and could also lead to new approaches to construct
phylogenetic networks from trinets.
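The statement that triplets encode rooted phylogenetic trees can be illustrated with a small sketch. It is not taken from the paper: the nested-tuple tree representation and the helper names `leaves`, `clusters` and `triplets` are assumptions made for this example only. The sketch lists the rooted triplets ab|c displayed by a tree and shows that two different trees on the same four leaves yield different triplet sets.

```python
from itertools import combinations


def leaves(tree):
    """Leaf set of a rooted tree given as nested tuples of leaf-name strings."""
    if isinstance(tree, str):
        return frozenset([tree])
    return frozenset().union(*(leaves(child) for child in tree))


def clusters(tree):
    """All clusters of the tree, i.e. the leaf sets below each vertex."""
    if isinstance(tree, str):
        return {frozenset([tree])}
    result = {leaves(tree)}
    for child in tree:
        result |= clusters(child)
    return result


def triplets(tree):
    """Rooted triplets ab|c displayed by the tree, as pairs ({a, b}, c).

    ab|c is displayed exactly when some cluster contains a and b but not c.
    """
    cl = clusters(tree)
    found = set()
    for three in combinations(sorted(leaves(tree)), 3):
        for c in three:
            a, b = (x for x in three if x != c)
            if any(a in s and b in s and c not in s for s in cl):
                found.add((frozenset([a, b]), c))
    return found


# Two different rooted binary trees on the leaves a, b, c, d (hypothetical data).
balanced = (("a", "b"), ("c", "d"))
caterpillar = ((("a", "b"), "c"), "d")

print(triplets(balanced) == triplets(caterpillar))  # the triplet sets differ
```

For networks the analogous triplet sets need not distinguish non-isomorphic networks, which is the motivation the abstract gives for trinets.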
FlatNJ: A novel network-based approach to visualize evolutionary and biogeographical relationships
Split networks are a type of phylogenetic network that allows conflict in evolutionary data to be visualized. We present a new method for constructing such networks called FlatNetJoining (FlatNJ). A key feature of FlatNJ is that it produces networks that can be drawn in the plane and in which labels may appear inside the network. For complex data sets that involve, for example, non-neutral molecular markers, this can allow additional detail to be visualized compared with previous methods such as split decomposition and NeighborNet. We illustrate FlatNJ by applying it to whole HIV genome sequences, where recombination has taken place; fluorescent proteins in corals, where ancestral sequences are present; and mitochondrial DNA sequences from gall wasps, where biogeographical relationships are of interest. We find that the networks generated by FlatNJ can facilitate the study of genetic variation in the underlying molecular sequence data and, in particular, may help to investigate processes such as intra-locus recombination. FlatNJ has been implemented in Java and is freely available at www.uea.ac.uk/computing/software/flatnj.
On Patchworks and Hierarchies
Motivated by questions in biological classification, we discuss some
elementary combinatorial and computational properties of certain set systems
that generalize hierarchies, namely, 'patchworks', 'weak patchworks', 'ample
patchworks' and 'saturated patchworks' and also outline how these concepts
relate to an apparently new 'duality theory' for cluster systems that is based
on the fundamental concept of 'compatibility' of clusters.
Comment: 17 pages, 2 figures
Metatranscriptomes from diverse microbial communities: assessment of data reduction techniques for rigorous annotation
Background: Metatranscriptome sequence data can contain highly redundant sequences from diverse populations of microbes, and so data reduction techniques are often applied before taxonomic and functional annotation. For metagenomic data, it has been observed that variable coverage and the presence of closely related organisms can lead to fragmented assemblies containing chimeric contigs that may reduce the accuracy of downstream analyses, and some advocate the use of alternate data reduction techniques. However, it is unclear how such data reduction techniques impact the annotation of metatranscriptome data and thus affect the interpretation of the results.
Results: To investigate the effect of such techniques on the annotation of metatranscriptome data, we assess two commonly employed methods: clustering and de novo assembly. To do this, we also developed an approach to simulate 454 and Illumina metatranscriptome data sets with varying degrees of taxonomic diversity. For the Illumina simulations, we found that a two-step approach of assembly followed by clustering of contigs and unassembled sequences produced the most accurate reflection of the real protein domain content of the sample. For the 454 simulations, the combined annotation of contigs and unassembled reads produced the most accurate protein domain annotations.
Conclusions: Based on these data, we recommend that assembly be attempted and that unassembled reads be included in the final annotation for metatranscriptome data, even from highly diverse environments, as the resulting annotations should lead to a more accurate reflection of the transcriptional behaviour of the microbial population under investigation.
Spaces of phylogenetic networks from generalized nearest-neighbor interchange operations
Phylogenetic networks are a generalization of evolutionary or phylogenetic trees that are used to represent the evolution of species which have undergone reticulate evolution. In this paper we consider spaces of such networks defined by some novel local operations that we introduce for converting one phylogenetic network into another. These operations are modeled on the well-studied nearest-neighbor interchange (NNI) operations on phylogenetic trees, and lead to natural generalizations of the tree spaces that have been previously associated with such operations. We present several results on spaces of some relatively simple networks, called level-1 networks, including the size of the neighborhood of a fixed network, and bounds on the diameter of the metric defined by taking the smallest number of operations required to convert one network into another. We expect that our results will be useful in the development of methods for systematically searching for optimal phylogenetic networks using, for example, likelihood and Bayesian approaches.
Characterizing block graphs in terms of their vertex-induced partitions
Block graphs are a generalization of trees that arise in areas such as metric graph theory, molecular graphs, and phylogenetics. Given a finite connected simple graph $G=(V,E)$ with vertex set $V$ and edge set $E$, we will show that the (necessarily unique) smallest block graph with vertex set $V$ whose edge set contains $E$ is uniquely determined by the $V$-indexed family $\mathcal{P}_G = (\pi_v)_{v \in V}$ of the partitions $\pi_v$ of the set $V$ into the set of connected components of the graph $(V, \{e \in E : v \notin e\})$. Moreover, we show that an arbitrary $V$-indexed family $\mathcal{P} = (p_v)_{v \in V}$ of partitions $p_v$ of the set $V$ is of the form $\mathcal{P} = \mathcal{P}_G$ for some connected simple graph $G=(V,E)$ with vertex set $V$ as above if and only if, for any two distinct elements $u, v \in V$, the union of the set in $p_v$ that contains $u$ and the set in $p_u$ that contains $v$ coincides with the set $V$, and $\{v\} \in p_v$ holds for all $v \in V$. As well as being of inherent interest to the theory of block graphs, these facts are also useful in the analysis of compatible decompositions of finite metric spaces.
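The vertex-induced partitions described in this abstract can be computed directly. The following is a minimal sketch, not the authors' code. It assumes that the partition for a vertex v consists of the connected components of the graph left after deleting every edge incident with v, so that v itself forms a singleton block. The function names and the example graph are hypothetical.

```python
from itertools import combinations


def vertex_induced_partitions(vertices, edges):
    """Map each vertex v to the partition of `vertices` into the connected
    components of the graph with all edges incident with v removed
    (assumed reading of the abstract)."""
    partitions = {}
    for v in vertices:
        adjacency = {x: set() for x in vertices}
        for a, b in edges:
            if v not in (a, b):
                adjacency[a].add(b)
                adjacency[b].add(a)
        seen, blocks = set(), set()
        for start in vertices:
            if start in seen:
                continue
            stack, component = [start], set()
            while stack:  # depth-first search from `start`
                x = stack.pop()
                if x not in component:
                    component.add(x)
                    stack.extend(adjacency[x] - component)
            seen |= component
            blocks.add(frozenset(component))
        partitions[v] = blocks
    return partitions


def block_containing(partition, x):
    """The unique block of `partition` that contains x."""
    return next(block for block in partition if x in block)


def satisfies_condition(vertices, partitions):
    """Check the two conditions stated in the abstract for a family of partitions."""
    vs = set(vertices)
    if any(frozenset([v]) not in partitions[v] for v in vs):
        return False
    return all(
        block_containing(partitions[v], u) | block_containing(partitions[u], v) == vs
        for u, v in combinations(vs, 2)
    )


# Hypothetical example: the path a - b - c.
V = ["a", "b", "c"]
E = [("a", "b"), ("b", "c")]
family = vertex_induced_partitions(V, E)
print(satisfies_condition(V, family))
```

A family that does not come from a connected graph, such as the all-singletons partition at every vertex of a three-element set, fails the union condition.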